Hyper log log plus plus (HLL++) #2522
base: branch-25.02
Conversation
b6f5cf5 to 526a61f
src/main/cpp/src/HLLPP.cu
Outdated
rmm::cuda_stream_view stream, | ||
rmm::device_async_resource_ref mr) | ||
{ | ||
CUDF_EXPECTS(precision >= 4 && precision <= 18, "HLL++ requires precision in range: [4, 18]"); |
We can use std::numeric_limits<>::digits instead of hardcoded values 4 and 18.
src/main/cpp/src/HLLPP.cu
Outdated
auto input_cols = std::vector<int64_t const*>(input_iter, input_iter + input.num_children()); | ||
auto d_inputs = cudf::detail::make_device_uvector_async(input_cols, stream, mr); | ||
auto result = cudf::make_numeric_column( | ||
cudf::data_type{cudf::type_id::INT64}, input.size(), cudf::mask_state::ALL_VALID, stream); |
Do we need such an all-valid null mask? How about cudf::mask_state::UNALLOCATED?
Tested Spark behavior: approx_count_distinct(null) returns 0, so the values in the result column are always non-null.
I meant, if all rows are valid, we don't need to allocate a null mask.
BTW, we need to pass mr to the returning column (but do not pass it to the intermediate vector/column).
cudf::data_type{cudf::type_id::INT64}, input.size(), cudf::mask_state::ALL_VALID, stream); | |
cudf::data_type{cudf::type_id::INT64}, input.size(), cudf::mask_state::UNALLOCATED, stream, mr); |
Done.
src/main/cpp/src/HLLPP.cu
Outdated
auto result = cudf::make_numeric_column( | ||
cudf::data_type{cudf::type_id::INT64}, input.size(), cudf::mask_state::ALL_VALID, stream); | ||
// evaluate from struct<long, ..., long> | ||
thrust::for_each_n(rmm::exec_policy(stream), |
Try to use exec_policy_nosync as much as possible.
thrust::for_each_n(rmm::exec_policy(stream), | |
thrust::for_each_n(rmm::exec_policy_nosync(stream), |
Done.
* The input sketch values must be given in the format `LIST<INT8>`. | ||
* | ||
* @param input The sketch column which constains `LIST<INT8> values. |
INT8 or INT64?
In addition, in estimate_from_hll_sketches I see that the input is STRUCT<LONG, LONG, ....> instead of LIST<>. Why?
It's STRUCT<LONG, LONG, ....>, consistent with Spark. The input is columnar data, e.g.: sketch 0 is composed of all the data of the children at index 0.
Updated the function comments, refer to commit
Signed-off-by: Chong Gao <[email protected]>
Ready for review except test cases.
src/main/cpp/CMakeLists.txt
Outdated
@@ -196,6 +196,7 @@ add_library( | |||
src/HashJni.cpp | |||
src/HistogramJni.cpp | |||
src/HostTableJni.cpp | |||
src/HLLPPJni.cpp |
Let's try to be generic.
src/HLLPPJni.cpp | |
src/AggregationJni.cpp |
Renamed to HLLPPHostUDFJni; AggregationJni is too generic.
src/main/cpp/CMakeLists.txt
Outdated
@@ -204,6 +205,7 @@ add_library( | |||
src/SparkResourceAdaptorJni.cpp | |||
src/SubStringIndexJni.cpp | |||
src/ZOrderJni.cpp | |||
src/HLLPP.cu |
How about HyperLogLogPP?
src/HLLPP.cu | |
src/HyperLogLogPP.cu |
This name also applies to the .hpp and *.java files.
Done.
src/main/cpp/src/HLLPP.cu
Outdated
@@ -0,0 +1,102 @@ | |||
/* | |||
* Copyright (c) 2023-2024, NVIDIA CORPORATION. |
* Copyright (c) 2023-2024, NVIDIA CORPORATION. | |
* Copyright (c) 2024-2025, NVIDIA CORPORATION. |
Done.
src/main/cpp/src/HLLPP.cu
Outdated
int64_t shift_mask = MASK << (REGISTER_VALUE_BITS * reg_idx); | ||
int64_t v = (long_10_registers & shift_mask) >> (REGISTER_VALUE_BITS * reg_idx); |
int64_t shift_mask = MASK << (REGISTER_VALUE_BITS * reg_idx); | |
int64_t v = (long_10_registers & shift_mask) >> (REGISTER_VALUE_BITS * reg_idx); | |
auto const shift_bits = REGISTER_VALUE_BITS * reg_idx; | |
auto const shift_mask = MASK << shift_bits; | |
auto const v = (long_10_registers & shift_mask) >> shift_bits; |
Done.
src/main/cpp/src/HLLPP.cu
Outdated
} | ||
struct estimate_fn { | ||
cudf::device_span<int64_t const*> sketch_longs; |
cudf::device_span<int64_t const*> sketch_longs; | |
cudf::device_span<int64_t const*> sketches; |
done
src/main/cpp/src/HLLPP.cu
Outdated
int const precision; | ||
int64_t* const out; |
We now favor non-const members so the functor can be moved by the compiler if needed.
In addition, member variables need to be sorted by their sizes to reduce padding.
int const precision; | |
int64_t* const out; | |
int64_t* out; | |
int precision; |
done
src/main/cpp/src/HLLPP.cu
Outdated
__device__ void operator()(cudf::size_type const idx) const | ||
{ | ||
auto const num_regs = 1ull << precision; |
This seems to be used to compare with signed int later, thus it should not be unsigned here.
auto const num_regs = 1ull << precision; | |
auto const num_regs = 1 << precision; |
done
src/main/cpp/src/HLLPP.cu
Outdated
rmm::cuda_stream_view stream, | ||
rmm::device_async_resource_ref mr) | ||
{ | ||
CUDF_EXPECTS(precision >= 4, "HyperLogLogPlusPlus requires precision is bigger than 4."); |
CUDF_EXPECTS(precision >= 4, "HyperLogLogPlusPlus requires precision is bigger than 4."); | |
CUDF_EXPECTS(precision >= 4, "HyperLogLogPlusPlus requires precision bigger than 4."); |
done
src/main/cpp/src/HLLPP.cu
Outdated
auto const input_iter = cudf::detail::make_counting_transform_iterator( | ||
0, [&](int i) { return input.child(i).begin<int64_t>(); }); |
We need a CUDF_EXPECTS to check for input type too (struct of longs).
done.
Now all the outer functions check:
CUDF_EXPECTS(input.type().id() == cudf::type_id::STRUCT,
"HyperLogLogPlusPlus buffer type must be a STRUCT of long columns.");
for (auto i = 0; i < input.num_children(); i++) {
CUDF_EXPECTS(input.child(i).type().id() == cudf::type_id::INT64,
"HyperLogLogPlusPlus buffer type must be a STRUCT of long columns.");
}
build
Verified Host UDF successfully via NVIDIA/spark-rapids#11638
Need to wait for the dependencies to be merged first before we can build.
int64_t const precision, // num of bits for register addressing, e.g.: 9 | ||
int* const registers_output_cache, // num is num_groups * num_registers_per_sketch | ||
int* const registers_thread_cache, // num is num_threads * num_registers_per_sketch | ||
cudf::size_type* const group_lables_thread_cache // save the group lables for each thread |
nit: labels?
done
* sketch. Input is a struct column with multiple long columns which is | ||
* consistent with Spark. Output is a struct scalar with multiple long values. | ||
*/ | ||
Reduction_MERGE(1), |
Naming convention should be consistent with GroupByMerge.
Reduction_MERGE(1), | |
ReductionMerge(1), |
done
/** | ||
* HyperLogLogPlusPlus(HLLPP) host UDF aggregation utils | ||
*/ | ||
public class HyperLogLogPlusPlusHostUDF { |
Now the Java interface has changed. Please reimplement this similar to https://github.com/NVIDIA/spark-rapids-jni/pull/2631/files#diff-3bf8ba05afd52e4ef36fa2c0431304bbc88bc07cafd976665e17113464811392R24-R41.
switch (agg_type) { | ||
case 0: return spark_rapids_jni::create_hllpp_reduction_host_udf(precision); | ||
case 1: return spark_rapids_jni::create_hllpp_reduction_merge_host_udf(precision); | ||
case 2: return spark_rapids_jni::create_hllpp_groupby_host_udf(precision); | ||
default: return spark_rapids_jni::create_hllpp_groupby_merge_host_udf(precision); |
switch (agg_type) { | |
case 0: return spark_rapids_jni::create_hllpp_reduction_host_udf(precision); | |
case 1: return spark_rapids_jni::create_hllpp_reduction_merge_host_udf(precision); | |
case 2: return spark_rapids_jni::create_hllpp_groupby_host_udf(precision); | |
default: return spark_rapids_jni::create_hllpp_groupby_merge_host_udf(precision); | |
switch (agg_type) { | |
case 0: return spark_rapids_jni::create_hllpp_reduction_host_udf(precision); | |
case 1: return spark_rapids_jni::create_hllpp_reduction_merge_host_udf(precision); | |
case 2: return spark_rapids_jni::create_hllpp_groupby_host_udf(precision); | |
case 3: return spark_rapids_jni::create_hllpp_groupby_merge_host_udf(precision); | |
default: CUDF_FAIL("Invalid aggregation type."); |
done
/** | ||
* The number of bits that is required for a HLLPP register value. | ||
* | ||
* This number is determined by the maximum number of leading binary zeros a | ||
* hashcode can produce. This is equal to the number of bits the hashcode | ||
* returns. The current implementation uses a 64-bit hashcode, this means 6-bits | ||
* are (at most) needed to store the number of leading zeros. | ||
*/ | ||
constexpr int REGISTER_VALUE_BITS = 6; | ||
// MASK binary 6 bits: 111-111 | ||
constexpr uint64_t MASK = (1L << REGISTER_VALUE_BITS) - 1L; | ||
// This value is 10, one long stores 10 register values | ||
constexpr int REGISTERS_PER_LONG = 64 / REGISTER_VALUE_BITS; | ||
// XXHash seed, consistent with Spark | ||
constexpr int64_t SEED = 42L; | ||
// max precision, if require a precision bigger than 18, then use 18. | ||
constexpr int MAX_PRECISION = 18; |
Do these values need to be exposed publicly? Otherwise, please move them to the source file.
done, moved to source file.
template <typename cudf_aggregation> | ||
struct hllpp_udf : cudf::host_udf_base { | ||
static_assert(std::is_same_v<cudf_aggregation, cudf::reduce_aggregation> || | ||
std::is_same_v<cudf_aggregation, cudf::groupby_aggregation>); |
template <typename cudf_aggregation> | |
struct hllpp_udf : cudf::host_udf_base { | |
static_assert(std::is_same_v<cudf_aggregation, cudf::reduce_aggregation> || | |
std::is_same_v<cudf_aggregation, cudf::groupby_aggregation>); | |
struct hllpp_udf : cudf::groupby_host_udf, cudf::reduce_host_udf { |
done.
[[nodiscard]] output_type operator()(host_udf_input const& udf_input, | ||
rmm::cuda_stream_view stream, | ||
rmm::device_async_resource_ref mr) const override | ||
{ | ||
if constexpr (std::is_same_v<cudf_aggregation, cudf::reduce_aggregation>) { |
With the new interfaces, this needs to be separated into two operator() functions for reduction/groupby.
[[nodiscard]] output_type get_empty_output( | ||
[[maybe_unused]] std::optional<cudf::data_type> output_dtype, | ||
rmm::cuda_stream_view stream, | ||
rmm::device_async_resource_ref mr) const override | ||
{ | ||
int num_registers = 1 << precision; | ||
int num_long_cols = num_registers / REGISTERS_PER_LONG + 1; | ||
auto const results_iter = cudf::detail::make_counting_transform_iterator( | ||
0, [&](int i) { return cudf::make_empty_column(cudf::data_type{cudf::type_id::INT64}); }); | ||
auto children = | ||
std::vector<std::unique_ptr<cudf::column>>(results_iter, results_iter + num_long_cols); | ||
if constexpr (std::is_same_v<cudf_aggregation, cudf::reduce_aggregation>) { | ||
// reduce | ||
auto host_results_view_iter = thrust::make_transform_iterator( | ||
children.begin(), [](auto const& results_column) { return results_column->view(); }); | ||
auto views = std::vector<cudf::column_view>(host_results_view_iter, | ||
host_results_view_iter + num_long_cols); | ||
auto table_view = cudf::table_view{views}; | ||
auto table = cudf::table(table_view); | ||
return std::make_unique<cudf::struct_scalar>(std::move(table), true, stream, mr); | ||
} else { | ||
// groupby | ||
return cudf::make_structs_column(0, | ||
std::move(children), | ||
0, // null count | ||
rmm::device_buffer{}, // null mask | ||
stream, | ||
mr); | ||
} | ||
} |
[[nodiscard]] output_type get_empty_output( | |
[[maybe_unused]] std::optional<cudf::data_type> output_dtype, | |
rmm::cuda_stream_view stream, | |
rmm::device_async_resource_ref mr) const override | |
{ | |
int num_registers = 1 << precision; | |
int num_long_cols = num_registers / REGISTERS_PER_LONG + 1; | |
auto const results_iter = cudf::detail::make_counting_transform_iterator( | |
0, [&](int i) { return cudf::make_empty_column(cudf::data_type{cudf::type_id::INT64}); }); | |
auto children = | |
std::vector<std::unique_ptr<cudf::column>>(results_iter, results_iter + num_long_cols); | |
if constexpr (std::is_same_v<cudf_aggregation, cudf::reduce_aggregation>) { | |
// reduce | |
auto host_results_view_iter = thrust::make_transform_iterator( | |
children.begin(), [](auto const& results_column) { return results_column->view(); }); | |
auto views = std::vector<cudf::column_view>(host_results_view_iter, | |
host_results_view_iter + num_long_cols); | |
auto table_view = cudf::table_view{views}; | |
auto table = cudf::table(table_view); | |
return std::make_unique<cudf::struct_scalar>(std::move(table), true, stream, mr); | |
} else { | |
// groupby | |
return cudf::make_structs_column(0, | |
std::move(children), | |
0, // null count | |
rmm::device_buffer{}, // null mask | |
stream, | |
mr); | |
} | |
} | |
[[nodiscard]] std::unique_ptr<cudf::column> get_empty_output( | |
rmm::cuda_stream_view stream, | |
rmm::device_async_resource_ref mr) const override | |
{ | |
int num_registers = 1 << precision; | |
int num_long_cols = num_registers / REGISTERS_PER_LONG + 1; | |
auto const results_iter = cudf::detail::make_counting_transform_iterator( | |
0, [&](int i) { return cudf::make_empty_column(cudf::data_type{cudf::type_id::INT64}); }); | |
auto children = | |
std::vector<std::unique_ptr<cudf::column>>(results_iter, results_iter + num_long_cols); | |
return cudf::make_structs_column(0, | |
std::move(children), | |
0, // null count | |
rmm::device_buffer{}, // null mask | |
stream, | |
mr); | |
} |
The interface for
[[nodiscard]] input_data_attributes get_required_data() const override | ||
{ | ||
if constexpr (std::is_same_v<cudf_aggregation, cudf::reduce_aggregation>) { | ||
return {reduction_data_attribute::INPUT_VALUES}; | ||
} else { | ||
return {groupby_data_attribute::GROUPED_VALUES, | ||
groupby_data_attribute::GROUP_OFFSETS, | ||
groupby_data_attribute::GROUP_LABELS}; | ||
} | ||
} |
The new interface does not have this function anymore.
[[nodiscard]] input_data_attributes get_required_data() const override | |
{ | |
if constexpr (std::is_same_v<cudf_aggregation, cudf::reduce_aggregation>) { | |
return {reduction_data_attribute::INPUT_VALUES}; | |
} else { | |
return {groupby_data_attribute::GROUPED_VALUES, | |
groupby_data_attribute::GROUP_OFFSETS, | |
groupby_data_attribute::GROUP_LABELS}; | |
} | |
} |
auto const& input_values = | ||
std::get<cudf::column_view>(udf_input.at(reduction_data_attribute::INPUT_VALUES)); |
With the new interface, the input column is passed as a function parameter.
auto const& input_values = | |
std::get<cudf::column_view>(udf_input.at(reduction_data_attribute::INPUT_VALUES)); |
auto const& group_values = | ||
std::get<cudf::column_view>(udf_input.at(groupby_data_attribute::GROUPED_VALUES)); |
auto const& group_values = | |
std::get<cudf::column_view>(udf_input.at(groupby_data_attribute::GROUPED_VALUES)); | |
auto const group_values = get_grouped_values(); |
auto const group_offsets = std::get<cudf::device_span<cudf::size_type const>>( | ||
udf_input.at(groupby_data_attribute::GROUP_OFFSETS)); |
auto const group_offsets = std::get<cudf::device_span<cudf::size_type const>>( | |
udf_input.at(groupby_data_attribute::GROUP_OFFSETS)); | |
auto const group_offsets = get_group_offsets(); |
auto const group_labels = std::get<cudf::device_span<cudf::size_type const>>( | ||
udf_input.at(groupby_data_attribute::GROUP_LABELS)); |
auto const group_labels = std::get<cudf::device_span<cudf::size_type const>>( | |
udf_input.at(groupby_data_attribute::GROUP_LABELS)); | |
auto const group_labels = get_group_labels(); |
} | ||
/** | ||
* @brief create an empty struct scalar |
* @brief create an empty struct scalar | |
* @brief Create an empty column when the input is empty. |
I didn't read through all of the code, and I don't know C++ well enough to feel like it would be good for me to review it. But we do need some kind of test added here to at least show that the code is minimally working. We should not rely only on spark-rapids to verify the code.
}(); | ||
CUDF_EXPECTS(udf_ptr != nullptr, "Invalid HyperLogLogPlusPlus(HLLPP) UDF instance."); | ||
return reinterpret_cast<jlong>(udf_ptr.release()); |
Where is this pointer released? It looks like a memory leak.
/** | ||
* Create a HyperLogLogPlusPlus(HLLPP) host UDF | ||
*/ | ||
public static long createHLLPPHostUDF(AggregationType type, int precision) { |
I think that this API leaks. We might want to think about how to redesign this and the java APIs to be more robust.
At a minimum we need an API that will let us free the HostUDF aggregation when we are done with it, but I really would prefer to have a class that acts as a wrapper around the long which is an AutoCloseable so we can use the withResource API in spark-rapids.
Oh I forgot about closing it. Yes, we can auto-close the wrapper class https://github.com/rapidsai/cudf/blob/bbf4f7824c23c0c482f52bafdf1ece1213da2f65/java/src/main/java/ai/rapids/cudf/HostUDFWrapper.java#L28.
I'll add a fix for it shortly.
Fix by rapidsai/cudf#17727.
Note that this class must extend HostUDFWrapper and override the required methods.
Add support for Hyper log log plus plus (HLL++)
Depends on:
- HOST_UDF aggregation for groupby rapidsai/cudf#17592
- HOST_UDF aggregation for reduction and segmented reduction rapidsai/cudf#17645

Signed-off-by: Chong Gao [email protected]